Overview

Dataset statistics

Number of variables: 23
Number of observations: 184835
Missing cells: 792234
Missing cells (%): 18.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 32.4 MiB
Average record size in memory: 184.0 B
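
The percentage of missing cells and the total memory size follow from the other figures in this block. A minimal sketch of that arithmetic, using only the numbers reported above (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 23
n_observations = 184_835
missing_cells = 792_234
avg_record_bytes = 184.0

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total size is roughly the record size times the row count (1 MiB = 2**20 bytes).
total_mib = avg_record_bytes * n_observations / 2**20

print(f"{total_cells} cells, {missing_pct:.1f}% missing, about {total_mib:.1f} MiB")
```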

Variable types

Numeric: 8
Categorical: 15

Warnings

sueldo_smdlv is highly correlated with otros_ingresos_smdlv and 1 other field (High correlation)
otros_ingresos_smdlv is highly correlated with sueldo_smdlv (High correlation)
año_credito is highly correlated with sueldo_smdlv (High correlation)
valor_credito_smdlv is highly correlated with cuotas (High correlation)
cuotas is highly correlated with valor_credito_smdlv (High correlation)
municipio_expedicion is highly correlated with municipio_credito and 3 other fields (High correlation)
periodo_credito is highly correlated with Row (High correlation)
sueldo_smdlv is highly correlated with año_credito and 1 other field (High correlation)
municipio_credito is highly correlated with municipio_expedicion and 2 other fields (High correlation)
Row is highly correlated with municipio_expedicion and 6 other fields (High correlation)
forma_pago is highly correlated with sector (High correlation)
municipio_residencia is highly correlated with municipio_expedicion and 3 other fields (High correlation)
codeudor is highly correlated with Row (High correlation)
genero is highly correlated with Row (High correlation)
sector is highly correlated with forma_pago (High correlation)
año_credito is highly correlated with sueldo_smdlv and 2 other fields (High correlation)
municipio_nacimiento is highly correlated with municipio_expedicion and 3 other fields (High correlation)
otros_ingresos_smdlv is highly correlated with sueldo_smdlv and 1 other field (High correlation)
municipio_expedicion is highly correlated with municipio_nacimiento (High correlation)
municipio_credito is highly correlated with municipio_residencia (High correlation)
municipio_nacimiento is highly correlated with municipio_expedicion (High correlation)
municipio_residencia is highly correlated with municipio_credito (High correlation)
genero has 97541 (52.8%) missing values (Missing)
estado_civil has 102421 (55.4%) missing values (Missing)
edad has 107699 (58.3%) missing values (Missing)
municipio_expedicion has 93581 (50.6%) missing values (Missing)
tiene_casa_propia has 100463 (54.4%) missing values (Missing)
sueldo_smdlv has 111082 (60.1%) missing values (Missing)
otros_ingresos_smdlv has 179418 (97.1%) missing values (Missing)
Row is uniformly distributed (Uniform)
Row has unique values (Unique)

Reproduction

Analysis started: 2021-06-04 06:02:04.678186
Analysis finished: 2021-06-04 06:02:52.214424
Duration: 47.54 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
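
The reported duration is the difference between the two timestamps. A small sketch of that check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2021-06-04 06:02:04.678186")
finished = datetime.fromisoformat("2021-06-04 06:02:52.214424")

# Subtracting two datetimes yields a timedelta.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```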

Variables

Row
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 184835
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 92418
Minimum: 1
Maximum: 184835
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 9242.7
Q1: 46209.5
median: 92418
Q3: 138626.5
95-th percentile: 175593.3
Maximum: 184835
Range: 184834
Interquartile range (IQR): 92417

Descriptive statistics

Standard deviation: 53357.41284
Coefficient of variation (CV): 0.5773487074
Kurtosis: -1.2
Mean: 92418
Median Absolute Deviation (MAD): 46209
Skewness: 0
Sum: 1.708208103 × 10^10
Variance: 2847013505
Monotonicity: Strictly increasing
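
Because Row is unique, strictly increasing, and runs from 1 to 184835, its statistics have closed forms. The sketch below assumes Row is exactly the sequence 1..n, that the variance is the sample variance (divisor n - 1), and that quantiles use linear interpolation. These assumptions reproduce the figures above, but the report itself does not state them.

```python
import math

n = 184_835  # number of observations; Row is assumed to be 1, 2, ..., n

mean = (n + 1) / 2            # midpoint of 1..n
total = n * (n + 1) // 2      # sum of the first n integers
variance = n * (n + 1) / 12   # sample variance (divisor n - 1) of 1..n
std = math.sqrt(variance)
cv = std / mean               # coefficient of variation


def quantile(p: float) -> float:
    """Linear-interpolation quantile of the sorted values 1..n."""
    return 1 + p * (n - 1)


iqr = quantile(0.75) - quantile(0.25)
print(mean, total, variance, round(std, 5), round(cv, 10), iqr)
```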
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
2047                     1         < 0.1%
150883                   1         < 0.1%
154977                   1         < 0.1%
152928                   1         < 0.1%
44351                    1         < 0.1%
42302                    1         < 0.1%
48445                    1         < 0.1%
46396                    1         < 0.1%
36155                    1         < 0.1%
34106                    1         < 0.1%
Other values (184825)    184825    > 99.9%

Value     Count    Frequency (%)
1         1        < 0.1%
2         1        < 0.1%
3         1        < 0.1%
4         1        < 0.1%
5         1        < 0.1%
6         1        < 0.1%
7         1        < 0.1%
8         1        < 0.1%
9         1        < 0.1%
10        1        < 0.1%

Value     Count    Frequency (%)
184835    1        < 0.1%
184834    1        < 0.1%
184833    1        < 0.1%
184832    1        < 0.1%
184831    1        < 0.1%
184830    1        < 0.1%
184829    1        < 0.1%
184828    1        < 0.1%
184827    1        < 0.1%
184826    1        < 0.1%

procedencia
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB
Nacional: 184680
Extranjero: 155

Length

Max length: 10
Median length: 8
Mean length: 8.001677172
Min length: 8
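
The length statistics can be recovered from the two category counts alone. A minimal sketch, assuming the labels are stored exactly as shown ("Nacional", "Extranjero"):

```python
# Category counts as reported for procedencia.
counts = {"Nacional": 184_680, "Extranjero": 155}

n_values = sum(counts.values())

# Each row contributes len(label) characters.
total_chars = sum(len(label) * n for label, n in counts.items())
mean_length = total_chars / n_values

min_length = min(len(label) for label in counts)
max_length = max(len(label) for label in counts)

print(total_chars, round(mean_length, 9), min_length, max_length)
```

The total of 1478990 characters matches the "Total characters" figure in the "Characters and Unicode" block.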

Characters and Unicode

Total characters: 1478990
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Nacional
2nd row: Nacional
3rd row: Nacional
4th row: Nacional
5th row: Nacional

Common Values

Value         Count     Frequency (%)
Nacional      184680    99.9%
Extranjero    155       0.1%

Length

Histogram of lengths of the category

Pie chart

Value         Count     Frequency (%)
nacional      184680    99.9%
extranjero    155       0.1%

Most occurring characters

ValueCountFrequency (%)
a369515
25.0%
o184835
12.5%
n184835
12.5%
N184680
12.5%
c184680
12.5%
i184680
12.5%
l184680
12.5%
r310
 
< 0.1%
E155
 
< 0.1%
x155
 
< 0.1%
Other values (3)465
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1294155
87.5%
Uppercase Letter184835
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a369515
28.6%
o184835
14.3%
n184835
14.3%
c184680
14.3%
i184680
14.3%
l184680
14.3%
r310
 
< 0.1%
x155
 
< 0.1%
t155
 
< 0.1%
j155
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N184680
99.9%
E155
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin1478990
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a369515
25.0%
o184835
12.5%
n184835
12.5%
N184680
12.5%
c184680
12.5%
i184680
12.5%
l184680
12.5%
r310
 
< 0.1%
E155
 
< 0.1%
x155
 
< 0.1%
Other values (3)465
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII1478990
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a369515
25.0%
o184835
12.5%
n184835
12.5%
N184680
12.5%
c184680
12.5%
i184680
12.5%
l184680
12.5%
r310
 
< 0.1%
E155
 
< 0.1%
x155
 
< 0.1%
Other values (3)465
 
< 0.1%

genero
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 97541
Missing (%): 52.8%
Memory size: 1.4 MiB
Femenino: 46934
Masculino: 40360

Length

Max length: 9
Median length: 8
Mean length: 8.462345637
Min length: 8

Characters and Unicode

Total characters: 738712
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Femenino
2nd row: Femenino
3rd row: Femenino
4th row: Femenino
5th row: Femenino

Common Values

Value        Count    Frequency (%)
Femenino     46934    25.4%
Masculino    40360    21.8%
(Missing)    97541    52.8%

Length

Histogram of lengths of the category

Pie chart

Value        Count    Frequency (%)
femenino     46934    53.8%
masculino    40360    46.2%
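
The two genero tables report different percentages for the same counts because they use different denominators. The first divides by all 184835 rows, the second only by rows with a value. A short sketch of both calculations (names are illustrative):

```python
n_rows = 184_835
missing = 97_541
counts = {"Femenino": 46_934, "Masculino": 40_360}

non_missing = n_rows - missing

# For each label: (share of all rows, share of non-missing rows), in percent.
shares = {
    label: (round(100 * n / n_rows, 1), round(100 * n / non_missing, 1))
    for label, n in counts.items()
}

for label, (of_all, of_present) in shares.items():
    print(f"{label}: {of_all}% of all rows, {of_present}% of non-missing rows")
```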

Most occurring characters

ValueCountFrequency (%)
n134228
18.2%
e93868
12.7%
i87294
11.8%
o87294
11.8%
F46934
 
6.4%
m46934
 
6.4%
M40360
 
5.5%
a40360
 
5.5%
s40360
 
5.5%
c40360
 
5.5%
Other values (2)80720
10.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter651418
88.2%
Uppercase Letter87294
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n134228
20.6%
e93868
14.4%
i87294
13.4%
o87294
13.4%
m46934
 
7.2%
a40360
 
6.2%
s40360
 
6.2%
c40360
 
6.2%
u40360
 
6.2%
l40360
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
F46934
53.8%
M40360
46.2%

Most occurring scripts

ValueCountFrequency (%)
Latin738712
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n134228
18.2%
e93868
12.7%
i87294
11.8%
o87294
11.8%
F46934
 
6.4%
m46934
 
6.4%
M40360
 
5.5%
a40360
 
5.5%
s40360
 
5.5%
c40360
 
5.5%
Other values (2)80720
10.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII738712
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n134228
18.2%
e93868
12.7%
i87294
11.8%
o87294
11.8%
F46934
 
6.4%
m46934
 
6.4%
M40360
 
5.5%
a40360
 
5.5%
s40360
 
5.5%
c40360
 
5.5%
Other values (2)80720
10.9%

estado_civil
Categorical

MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 102421
Missing (%): 55.4%
Memory size: 1.4 MiB
Union Libre: 32184
Casado: 24804
Soltero: 24147
Viudo: 810
Divorciado: 469

Length

Max length: 11
Median length: 7
Mean length: 8.258511903
Min length: 5

Characters and Unicode

Total characters: 680617
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Union Libre
2nd row: Union Libre
3rd row: Union Libre
4th row: Union Libre
5th row: Soltero

Common Values

Value          Count     Frequency (%)
Union Libre    32184     17.4%
Casado         24804     13.4%
Soltero        24147     13.1%
Viudo          810       0.4%
Divorciado     469       0.3%
(Missing)      102421    55.4%

Length

Histogram of lengths of the category

Pie chart

Value         Count    Frequency (%)
union         32184    28.1%
libre         32184    28.1%
casado        24804    21.6%
soltero       24147    21.1%
viudo         810      0.7%
divorciado    469      0.4%

Most occurring characters

ValueCountFrequency (%)
o107030
15.7%
i66116
9.7%
n64368
9.5%
r56800
 
8.3%
e56331
 
8.3%
a50077
 
7.4%
U32184
 
4.7%
32184
 
4.7%
L32184
 
4.7%
b32184
 
4.7%
Other values (11)151159
22.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter533835
78.4%
Uppercase Letter114598
 
16.8%
Space Separator32184
 
4.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o107030
20.0%
i66116
12.4%
n64368
12.1%
r56800
10.6%
e56331
10.6%
a50077
9.4%
b32184
 
6.0%
d26083
 
4.9%
s24804
 
4.6%
l24147
 
4.5%
Other values (4)25895
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
U32184
28.1%
L32184
28.1%
C24804
21.6%
S24147
21.1%
V810
 
0.7%
D469
 
0.4%
Space Separator
ValueCountFrequency (%)
32184
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin648433
95.3%
Common32184
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
o107030
16.5%
i66116
10.2%
n64368
9.9%
r56800
8.8%
e56331
8.7%
a50077
7.7%
U32184
 
5.0%
L32184
 
5.0%
b32184
 
5.0%
d26083
 
4.0%
Other values (10)125076
19.3%
Common
ValueCountFrequency (%)
32184
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII680617
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o107030
15.7%
i66116
9.7%
n64368
9.5%
r56800
 
8.3%
e56331
 
8.3%
a50077
 
7.4%
U32184
 
4.7%
32184
 
4.7%
L32184
 
4.7%
b32184
 
4.7%
Other values (11)151159
22.2%

edad
Real number (ℝ≥0)

MISSING

Distinct: 72
Distinct (%): 0.1%
Missing: 107699
Missing (%): 58.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.54807094
Minimum: 18
Maximum: 90
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 18
5-th percentile: 27
Q1: 35
median: 44
Q3: 53
95-th percentile: 65
Maximum: 90
Range: 72
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 11.85931542
Coefficient of variation (CV): 0.2662138938
Kurtosis: -0.3869785761
Mean: 44.54807094
Median Absolute Deviation (MAD): 9
Skewness: 0.3222662166
Sum: 3436260
Variance: 140.6433623
Monotonicity: Not monotonic
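
Several of the edad figures are linked: the mean is the sum divided by the number of non-missing rows, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A sketch of those relations using the reported values (names are illustrative):

```python
n_rows = 184_835
missing = 107_699
total = 3_436_260        # reported Sum
std = 11.85931542        # reported Standard deviation

# Missing values are excluded from the statistics.
n = n_rows - missing

mean = total / n
cv = std / mean
variance = std ** 2

print(n, round(mean, 8), round(cv, 10), round(variance, 7))
```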
Histogram with fixed size bins (bins=50)
Value                Count     Frequency (%)
44                   2967      1.6%
40                   2646      1.4%
45                   2483      1.3%
41                   2428      1.3%
49                   2369      1.3%
46                   2315      1.3%
35                   2303      1.2%
42                   2248      1.2%
37                   2211      1.2%
31                   2152      1.2%
Other values (62)    53014     28.7%
(Missing)            107699    58.3%

Value    Count    Frequency (%)
18       6        < 0.1%
19       31       < 0.1%
20       127      0.1%
21       205      0.1%
22       330      0.2%
23       442      0.2%
24       564      0.3%
25       876      0.5%
26       1030     0.6%
27       1225     0.7%

Value    Count    Frequency (%)
90       2        < 0.1%
89       1        < 0.1%
88       2        < 0.1%
87       1        < 0.1%
85       14       < 0.1%
84       41       < 0.1%
83       48       < 0.1%
82       20       < 0.1%
81       49       < 0.1%
80       44       < 0.1%

municipio_residencia
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB
ARAUCA: 74877
TAME: 56768
SARAVENA: 29913
Otros: 12457
ARAUQUITA: 10820

Length

Max length: 9
Median length: 6
Mean length: 5.817637352
Min length: 4

Characters and Unicode

Total characters: 1075303
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SARAVENA
2nd row: ARAUQUITA
3rd row: ARAUQUITA
4th row: ARAUCA
5th row: ARAUCA

Common Values

Value        Count    Frequency (%)
ARAUCA       74877    40.5%
TAME         56768    30.7%
SARAVENA     29913    16.2%
Otros        12457    6.7%
ARAUQUITA    10820    5.9%

Length

Histogram of lengths of the category

Pie chart

Value        Count    Frequency (%)
arauca       74877    40.5%
tame         56768    30.7%
saravena     29913    16.2%
otros        12457    6.7%
arauquita    10820    5.9%

Most occurring characters

ValueCountFrequency (%)
A403598
37.5%
R115610
 
10.8%
U96517
 
9.0%
E86681
 
8.1%
C74877
 
7.0%
T67588
 
6.3%
M56768
 
5.3%
S29913
 
2.8%
V29913
 
2.8%
N29913
 
2.8%
Other values (7)83925
 
7.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1025475
95.4%
Lowercase Letter49828
 
4.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A403598
39.4%
R115610
 
11.3%
U96517
 
9.4%
E86681
 
8.5%
C74877
 
7.3%
T67588
 
6.6%
M56768
 
5.5%
S29913
 
2.9%
V29913
 
2.9%
N29913
 
2.9%
Other values (3)34097
 
3.3%
Lowercase Letter
ValueCountFrequency (%)
t12457
25.0%
r12457
25.0%
o12457
25.0%
s12457
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1075303
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A403598
37.5%
R115610
 
10.8%
U96517
 
9.0%
E86681
 
8.1%
C74877
 
7.0%
T67588
 
6.3%
M56768
 
5.3%
S29913
 
2.8%
V29913
 
2.8%
N29913
 
2.8%
Other values (7)83925
 
7.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII1075303
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A403598
37.5%
R115610
 
10.8%
U96517
 
9.0%
E86681
 
8.1%
C74877
 
7.0%
T67588
 
6.3%
M56768
 
5.3%
S29913
 
2.8%
V29913
 
2.8%
N29913
 
2.8%
Other values (7)83925
 
7.8%

municipio_nacimiento
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 11
Missing (%): < 0.1%
Memory size: 1.4 MiB
ARAUCA: 118496
Otros: 32605
TAME: 18923
ARAUQUITA: 8085
SARAVENA: 6715

Length

Max length: 9
Median length: 6
Mean length: 5.822717829
Min length: 4

Characters and Unicode

Total characters: 1076178
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ARAUQUITA
2nd row: ARAUQUITA
3rd row: Otros
4th row: ARAUQUITA
5th row: ARAUCA

Common Values

Value        Count     Frequency (%)
ARAUCA       118496    64.1%
Otros        32605     17.6%
TAME         18923     10.2%
ARAUQUITA    8085      4.4%
SARAVENA     6715      3.6%
(Missing)    11        < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value        Count     Frequency (%)
arauca       118496    64.1%
otros        32605     17.6%
tame         18923     10.2%
arauquita    8085      4.4%
saravena     6715      3.6%

Most occurring characters

ValueCountFrequency (%)
A418811
38.9%
U134666
 
12.5%
R133296
 
12.4%
C118496
 
11.0%
O32605
 
3.0%
t32605
 
3.0%
r32605
 
3.0%
o32605
 
3.0%
s32605
 
3.0%
T27008
 
2.5%
Other values (7)80876
 
7.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter945758
87.9%
Lowercase Letter130420
 
12.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A418811
44.3%
U134666
 
14.2%
R133296
 
14.1%
C118496
 
12.5%
O32605
 
3.4%
T27008
 
2.9%
E25638
 
2.7%
M18923
 
2.0%
Q8085
 
0.9%
I8085
 
0.9%
Other values (3)20145
 
2.1%
Lowercase Letter
ValueCountFrequency (%)
t32605
25.0%
r32605
25.0%
o32605
25.0%
s32605
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1076178
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A418811
38.9%
U134666
 
12.5%
R133296
 
12.4%
C118496
 
11.0%
O32605
 
3.0%
t32605
 
3.0%
r32605
 
3.0%
o32605
 
3.0%
s32605
 
3.0%
T27008
 
2.5%
Other values (7)80876
 
7.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII1076178
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A418811
38.9%
U134666
 
12.5%
R133296
 
12.4%
C118496
 
11.0%
O32605
 
3.0%
t32605
 
3.0%
r32605
 
3.0%
o32605
 
3.0%
s32605
 
3.0%
T27008
 
2.5%
Other values (7)80876
 
7.5%

municipio_expedicion
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 93581
Missing (%): 50.6%
Memory size: 1.4 MiB
ARAUCA: 36144
Otros: 30948
TAME: 14834
ARAUQUITA: 6007
SARAVENA: 3321

Length

Max length: 9
Median length: 5
Mean length: 5.606011791
Min length: 4

Characters and Unicode

Total characters: 511571
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ARAUQUITA
2nd row: ARAUQUITA
3rd row: Otros
4th row: ARAUCA
5th row: ARAUCA

Common Values

Value        Count    Frequency (%)
ARAUCA       36144    19.6%
Otros        30948    16.7%
TAME         14834    8.0%
ARAUQUITA    6007     3.2%
SARAVENA     3321     1.8%
(Missing)    93581    50.6%

Length

Histogram of lengths of the category

Pie chart

Value        Count    Frequency (%)
arauca       36144    39.6%
otros        30948    33.9%
tame         14834    16.3%
arauquita    6007     6.6%
saravena     3321     3.6%

Most occurring characters

ValueCountFrequency (%)
A151250
29.6%
U48158
 
9.4%
R45472
 
8.9%
C36144
 
7.1%
O30948
 
6.0%
t30948
 
6.0%
r30948
 
6.0%
o30948
 
6.0%
s30948
 
6.0%
T20841
 
4.1%
Other values (7)54966
 
10.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter387779
75.8%
Lowercase Letter123792
 
24.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A151250
39.0%
U48158
 
12.4%
R45472
 
11.7%
C36144
 
9.3%
O30948
 
8.0%
T20841
 
5.4%
E18155
 
4.7%
M14834
 
3.8%
Q6007
 
1.5%
I6007
 
1.5%
Other values (3)9963
 
2.6%
Lowercase Letter
ValueCountFrequency (%)
t30948
25.0%
r30948
25.0%
o30948
25.0%
s30948
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin511571
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A151250
29.6%
U48158
 
9.4%
R45472
 
8.9%
C36144
 
7.1%
O30948
 
6.0%
t30948
 
6.0%
r30948
 
6.0%
o30948
 
6.0%
s30948
 
6.0%
T20841
 
4.1%
Other values (7)54966
 
10.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII511571
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A151250
29.6%
U48158
 
9.4%
R45472
 
8.9%
C36144
 
7.1%
O30948
 
6.0%
t30948
 
6.0%
r30948
 
6.0%
o30948
 
6.0%
s30948
 
6.0%
T20841
 
4.1%
Other values (7)54966
 
10.7%

tiene_casa_propia
Categorical

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 100463
Missing (%): 54.4%
Memory size: 1.4 MiB
Si: 61942
No: 22430

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 168744
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Si
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value        Count     Frequency (%)
Si           61942     33.5%
No           22430     12.1%
(Missing)    100463    54.4%

Length

Histogram of lengths of the category

Pie chart

Value    Count    Frequency (%)
si       61942    73.4%
no       22430    26.6%

Most occurring characters

ValueCountFrequency (%)
S61942
36.7%
i61942
36.7%
N22430
 
13.3%
o22430
 
13.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter84372
50.0%
Lowercase Letter84372
50.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S61942
73.4%
N22430
 
26.6%
Lowercase Letter
ValueCountFrequency (%)
i61942
73.4%
o22430
 
26.6%

Most occurring scripts

ValueCountFrequency (%)
Latin168744
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S61942
36.7%
i61942
36.7%
N22430
 
13.3%
o22430
 
13.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII168744
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S61942
36.7%
i61942
36.7%
N22430
 
13.3%
o22430
 
13.3%

sueldo_smdlv
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 520
Distinct (%): 0.7%
Missing: 111082
Missing (%): 60.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 103.9070004
Minimum: 3
Maximum: 600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 3
5-th percentile: 27
Q1: 44
median: 73
Q3: 130
95-th percentile: 291
Maximum: 600
Range: 597
Interquartile range (IQR): 86

Descriptive statistics

Standard deviation: 90.78111957
Coefficient of variation (CV): 0.8736766457
Kurtosis: 6.886391299
Mean: 103.9070004
Median Absolute Deviation (MAD): 35
Skewness: 2.312361054
Sum: 7663453
Variance: 8241.21167
Monotonicity: Not monotonic
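
The quantiles above already indicate a right-skewed distribution: the mean is well above the median, and the 95-th percentile lies beyond the conventional Tukey upper fence (Q3 + 1.5 × IQR). The fence is not part of the report; it is used here only as an illustrative rule of thumb.

```python
# Quantiles and mean copied from the sueldo_smdlv statistics.
minimum, q1, median, q3, p95, maximum = 3, 44, 73, 130, 291, 600
mean = 103.9070004

iqr = q3 - q1
value_range = maximum - minimum

# Tukey's rule of thumb (an assumption, not a figure from the report).
upper_fence = q3 + 1.5 * iqr

print(f"IQR={iqr}, range={value_range}, upper fence={upper_fence}")
print("mean above median:", mean > median)
print("95-th percentile above fence:", p95 > upper_fence)
```

Since the 95-th percentile exceeds the fence, at least 5% of the non-missing salaries would be flagged as high outliers under this rule.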
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
43                    1636      0.9%
34                    1573      0.9%
36                    1488      0.8%
46                    1373      0.7%
61                    1366      0.7%
38                    1334      0.7%
76                    1317      0.7%
69                    1268      0.7%
72                    1229      0.7%
65                    1204      0.7%
Other values (510)    59965     32.4%
(Missing)             111082    60.1%

Value    Count    Frequency (%)
3        6        < 0.1%
4        3        < 0.1%
5        13       < 0.1%
6        21       < 0.1%
7        54       < 0.1%
8        25       < 0.1%
9        27       < 0.1%
10       133      0.1%
11       46       < 0.1%
12       45       < 0.1%

Value    Count    Frequency (%)
600      294      0.2%
599      1        < 0.1%
597      1        < 0.1%
593      1        < 0.1%
588      44       < 0.1%
586      6        < 0.1%
584      7        < 0.1%
582      10       < 0.1%
578      1        < 0.1%
576      15       < 0.1%

otros_ingresos
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB
SIN OTROS INGRESOS: 179418
CON OTROS INGRESOS: 5417

Length

Max length: 18
Median length: 18
Mean length: 18
Min length: 18

Characters and Unicode

Total characters: 3327030
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SIN OTROS INGRESOS
2nd row: SIN OTROS INGRESOS
3rd row: SIN OTROS INGRESOS
4th row: SIN OTROS INGRESOS
5th row: SIN OTROS INGRESOS

Common Values

Value                 Count     Frequency (%)
SIN OTROS INGRESOS    179418    97.1%
CON OTROS INGRESOS    5417      2.9%

Length

Histogram of lengths of the category

Pie chart

Value       Count     Frequency (%)
otros       184835    33.3%
ingresos    184835    33.3%
sin         179418    32.4%
con         5417      1.0%

Most occurring characters

ValueCountFrequency (%)
S733923
22.1%
O559922
16.8%
N369670
11.1%
369670
11.1%
R369670
11.1%
I364253
10.9%
T184835
 
5.6%
G184835
 
5.6%
E184835
 
5.6%
C5417
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter2957360
88.9%
Space Separator369670
 
11.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S733923
24.8%
O559922
18.9%
N369670
12.5%
R369670
12.5%
I364253
12.3%
T184835
 
6.2%
G184835
 
6.2%
E184835
 
6.2%
C5417
 
0.2%
Space Separator
ValueCountFrequency (%)
369670
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2957360
88.9%
Common369670
 
11.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
S733923
24.8%
O559922
18.9%
N369670
12.5%
R369670
12.5%
I364253
12.3%
T184835
 
6.2%
G184835
 
6.2%
E184835
 
6.2%
C5417
 
0.2%
Common
ValueCountFrequency (%)
369670
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3327030
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S733923
22.1%
O559922
16.8%
N369670
11.1%
369670
11.1%
R369670
11.1%
I364253
10.9%
T184835
 
5.6%
G184835
 
5.6%
E184835
 
5.6%
C5417
 
0.2%

otros_ingresos_smdlv
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 212
Distinct (%): 3.9%
Missing: 179418
Missing (%): 97.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.56839579
Minimum: 1
Maximum: 300
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 7
Q1: 20
median: 34
Q3: 63
95-th percentile: 194
Maximum: 300
Range: 299
Interquartile range (IQR): 43

Descriptive statistics

Standard deviation: 59.05556072
Coefficient of variation (CV): 1.082230105
Kurtosis: 5.983563774
Mean: 54.56839579
Median Absolute Deviation (MAD): 17
Skewness: 2.397412022
Sum: 295597
Variance: 3487.559252
Monotonicity: Not monotonic
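
The missing count here (179418) equals the number of rows labelled SIN OTROS INGRESOS in the otros_ingresos variable. That is consistent with an amount being recorded only when other income exists, although the report does not confirm this row by row. A sketch of the count check and of the mean over non-missing rows (names are illustrative):

```python
n_rows = 184_835
missing_amount = 179_418          # missing otros_ingresos_smdlv values
amount_sum = 295_597              # reported Sum

# Category counts reported for the otros_ingresos variable.
flag_counts = {"SIN OTROS INGRESOS": 179_418, "CON OTROS INGRESOS": 5_417}

non_missing = n_rows - missing_amount
counts_agree = non_missing == flag_counts["CON OTROS INGRESOS"]

# The mean is computed over non-missing rows only.
mean = amount_sum / non_missing

print(non_missing, counts_agree, round(mean, 8))
```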
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
36                    164       0.1%
34                    163       0.1%
18                    158       0.1%
23                    131       0.1%
17                    131       0.1%
20                    130       0.1%
21                    129       0.1%
30                    118       0.1%
13                    113       0.1%
26                    111       0.1%
Other values (202)    4069      2.2%
(Missing)             179418    97.1%

Value    Count    Frequency (%)
1        7        < 0.1%
3        40       < 0.1%
4        42       < 0.1%
5        40       < 0.1%
6        50       < 0.1%
7        97       0.1%
8        65       < 0.1%
9        47       < 0.1%
10       94       0.1%
11       50       < 0.1%

Value    Count    Frequency (%)
300      90       < 0.1%
294      8        < 0.1%
291      11       < 0.1%
288      3        < 0.1%
280      2        < 0.1%
276      1        < 0.1%
271      7        < 0.1%
268      2        < 0.1%
264      4        < 0.1%
262      2        < 0.1%

municipio_credito
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB
ARAUCA: 74131
TAME: 50656
SARAVENA: 24984
ARAUQUITA: 18279
Otros: 16785

Length

Max length: 9
Median length: 6
Mean length: 5.928087213
Min length: 4

Characters and Unicode

Total characters: 1095718
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SARAVENA
2nd row: ARAUQUITA
3rd row: ARAUQUITA
4th row: SARAVENA
5th row: ARAUCA

Common Values

Value        Count    Frequency (%)
ARAUCA       74131    40.1%
TAME         50656    27.4%
SARAVENA     24984    13.5%
ARAUQUITA    18279    9.9%
Otros        16785    9.1%

Length

Histogram of lengths of the category

Pie chart

Value        Count    Frequency (%)
arauca       74131    40.1%
tame         50656    27.4%
saravena     24984    13.5%
arauquita    18279    9.9%
otros        16785    9.1%

Most occurring characters

ValueCountFrequency (%)
A402838
36.8%
R117394
 
10.7%
U110689
 
10.1%
E75640
 
6.9%
C74131
 
6.8%
T68935
 
6.3%
M50656
 
4.6%
S24984
 
2.3%
V24984
 
2.3%
N24984
 
2.3%
Other values (7)120483
 
11.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1028578
93.9%
Lowercase Letter67140
 
6.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A402838
39.2%
R117394
 
11.4%
U110689
 
10.8%
E75640
 
7.4%
C74131
 
7.2%
T68935
 
6.7%
M50656
 
4.9%
S24984
 
2.4%
V24984
 
2.4%
N24984
 
2.4%
Other values (3)53343
 
5.2%
Lowercase Letter
ValueCountFrequency (%)
t16785
25.0%
r16785
25.0%
o16785
25.0%
s16785
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1095718
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A402838
36.8%
R117394
 
10.7%
U110689
 
10.1%
E75640
 
6.9%
C74131
 
6.8%
T68935
 
6.3%
M50656
 
4.6%
S24984
 
2.3%
V24984
 
2.3%
N24984
 
2.3%
Other values (7)120483
 
11.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1095718
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A402838
36.8%
R117394
 
10.7%
U110689
 
10.1%
E75640
 
6.9%
C74131
 
6.8%
T68935
 
6.3%
M50656
 
4.6%
S24984
 
2.3%
V24984
 
2.3%
N24984
 
2.3%
Other values (7)120483
 
11.0%

codeudor
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB
SIN CODEUDOR: 157439
CON CODEUDOR: 27396

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 2218020
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SIN CODEUDOR
2nd row: CON CODEUDOR
3rd row: CON CODEUDOR
4th row: CON CODEUDOR
5th row: CON CODEUDOR

Common Values

Value           Count     Frequency (%)
SIN CODEUDOR    157439    85.2%
CON CODEUDOR    27396     14.8%

Length

Histogram of lengths of the category

Pie chart

Value       Count     Frequency (%)
codeudor    184835    50.0%
sin         157439    42.6%
con         27396     7.4%

Most occurring characters

ValueCountFrequency (%)
O397066
17.9%
D369670
16.7%
C212231
9.6%
N184835
8.3%
184835
8.3%
E184835
8.3%
U184835
8.3%
R184835
8.3%
S157439
 
7.1%
I157439
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter2033185
91.7%
Space Separator184835
 
8.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O397066
19.5%
D369670
18.2%
C212231
10.4%
N184835
9.1%
E184835
9.1%
U184835
9.1%
R184835
9.1%
S157439
 
7.7%
I157439
 
7.7%
Space Separator
ValueCountFrequency (%)
184835
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2033185
91.7%
Common184835
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
O397066
19.5%
D369670
18.2%
C212231
10.4%
N184835
9.1%
E184835
9.1%
U184835
9.1%
R184835
9.1%
S157439
 
7.7%
I157439
 
7.7%
Common
ValueCountFrequency (%)
184835
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII2218020
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O397066
17.9%
D369670
16.7%
C212231
9.6%
N184835
8.3%
184835
8.3%
E184835
8.3%
U184835
8.3%
R184835
8.3%
S157439
 
7.1%
I157439
 
7.1%

sector
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 18
Missing (%): < 0.1%
Memory size: 1.4 MiB
PRIVADO: 166537
PUBLICO: 18280

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 1293719
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRIVADO
2nd row: PRIVADO
3rd row: PRIVADO
4th row: PRIVADO
5th row: PRIVADO

Common Values

Value        Count     Frequency (%)
PRIVADO      166537    90.1%
PUBLICO      18280     9.9%
(Missing)    18        < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value      Count     Frequency (%)
privado    166537    90.1%
publico    18280     9.9%

Most occurring characters

ValueCountFrequency (%)
P184817
14.3%
I184817
14.3%
O184817
14.3%
R166537
12.9%
V166537
12.9%
A166537
12.9%
D166537
12.9%
U18280
 
1.4%
B18280
 
1.4%
L18280
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1293719
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P184817
14.3%
I184817
14.3%
O184817
14.3%
R166537
12.9%
V166537
12.9%
A166537
12.9%
D166537
12.9%
U18280
 
1.4%
B18280
 
1.4%
L18280
 
1.4%

Most occurring scripts

ValueCountFrequency (%)
Latin1293719
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P184817
14.3%
I184817
14.3%
O184817
14.3%
R166537
12.9%
V166537
12.9%
A166537
12.9%
D166537
12.9%
U18280
 
1.4%
B18280
 
1.4%
L18280
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII1293719
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P184817
14.3%
I184817
14.3%
O184817
14.3%
R166537
12.9%
V166537
12.9%
A166537
12.9%
D166537
12.9%
U18280
 
1.4%
B18280
 
1.4%
L18280
 
1.4%

año_credito
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2012.337772
Minimum: 1996
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 1996
5-th percentile: 2001
Q1: 2009
median: 2014
Q3: 2017
95-th percentile: 2019
Maximum: 2021
Range: 25
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 5.511452785
Coefficient of variation (CV): 0.002738830858
Kurtosis: -0.001479673463
Mean: 2012.337772
Median Absolute Deviation (MAD): 3
Skewness: -0.8275903629
Sum: 371950452
Variance: 30.3761118
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
ValueCountFrequency (%)
201517560
 
9.5%
201714979
 
8.1%
201414246
 
7.7%
201614088
 
7.6%
201814069
 
7.6%
201312716
 
6.9%
201211756
 
6.4%
201110809
 
5.8%
201910389
 
5.6%
20108871
 
4.8%
Other values (16)55352
29.9%
ValueCountFrequency (%)
19961
 
< 0.1%
19971144
 
0.6%
19981998
1.1%
19992649
1.4%
20002552
1.4%
20012858
1.5%
20022807
1.5%
20032812
1.5%
20043661
2.0%
20054579
2.5%
ValueCountFrequency (%)
2021511
 
0.3%
20207337
4.0%
201910389
5.6%
201814069
7.6%
201714979
8.1%
201614088
7.6%
201517560
9.5%
201414246
7.7%
201312716
6.9%
201211756
6.4%

mes_credito
Real number (ℝ≥0)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.882717018
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.48600663
Coefficient of variation (CV): 0.5064869907
Kurtosis: -1.213905701
Mean: 6.882717018
Median Absolute Deviation (MAD): 3
Skewness: -0.1128749824
Sum: 1272167
Variance: 12.15224223
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)

Most frequent values

Value              Count    Frequency (%)
12                 21172    11.5%
10                 17498    9.5%
9                  15848    8.6%
11                 15580    8.4%
5                  15425    8.3%
7                  15339    8.3%
8                  14910    8.1%
6                  14777    8.0%
3                  14112    7.6%
4                  13594    7.4%
Other values (2)   26580    14.4%

Smallest values

Value              Count    Frequency (%)
1                  13201    7.1%
2                  13379    7.2%
3                  14112    7.6%
4                  13594    7.4%
5                  15425    8.3%
6                  14777    8.0%
7                  15339    8.3%
8                  14910    8.1%
9                  15848    8.6%
10                 17498    9.5%

Largest values

Value              Count    Frequency (%)
12                 21172    11.5%
11                 15580    8.4%
10                 17498    9.5%
9                  15848    8.6%
8                  14910    8.1%
7                  15339    8.3%
6                  14777    8.0%
5                  15425    8.3%
4                  13594    7.4%
3                  14112    7.6%

valor_credito_smdlv
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 660
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66.48601726
Minimum: 0
Maximum: 700
Zeros: 1558
Zeros (%): 0.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 19
Median: 45
Q3: 78
95-th percentile: 226
Maximum: 700
Range: 700
Interquartile range (IQR): 59

Descriptive statistics

Standard deviation: 78.39812045
Coefficient of variation (CV): 1.179167044
Kurtosis: 14.67234465
Mean: 66.48601726
Median Absolute Deviation (MAD): 28
Skewness: 3.155549206
Sum: 12288943
Variance: 6146.26529
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value                Count     Frequency (%)
6                    4426      2.4%
7                    3892      2.1%
16                   2930      1.6%
15                   2674      1.4%
8                    2644      1.4%
18                   2588      1.4%
17                   2555      1.4%
5                    2544      1.4%
14                   2337      1.3%
19                   2235      1.2%
Other values (650)   156010    84.4%

Smallest values

Value                Count     Frequency (%)
0                    1558      0.8%
1                    2172      1.2%
2                    2133      1.2%
3                    1527      0.8%
4                    1734      0.9%
5                    2544      1.4%
6                    4426      2.4%
7                    3892      2.1%
8                    2644      1.4%
9                    2050      1.1%

Largest values

Value                Count     Frequency (%)
700                  334       0.2%
698                  1         < 0.1%
697                  2         < 0.1%
696                  1         < 0.1%
695                  1         < 0.1%
694                  2         < 0.1%
693                  1         < 0.1%
691                  2         < 0.1%
687                  1         < 0.1%
686                  1         < 0.1%

tipo_venta
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value               Count
ELECTRODOMESTICOS   183951
MOTOS               865
CONTRATO            19

Length

Max length: 17
Median length: 17
Mean length: 16.94291666
Min length: 5

Characters and Unicode

Total characters: 3131644
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
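
The length and character totals for tipo_venta follow directly from the three category counts listed above. The short sketch below recomputes them as a consistency check. It uses only the counts printed in this report, and the variable names are illustrative.

```python
# Category counts for tipo_venta, copied from the report.
counts = {"ELECTRODOMESTICOS": 183951, "MOTOS": 865, "CONTRATO": 19}

n_rows = sum(counts.values())
# Each row contributes len(value) characters.
total_characters = sum(len(value) * count for value, count in counts.items())
mean_length = total_characters / n_rows
max_length = max(len(value) for value in counts)
min_length = min(len(value) for value in counts)

print(n_rows, total_characters, round(mean_length, 8), max_length, min_length)
```

The recomputed total and mean should match the "Total characters" and "Mean length" lines of the report.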

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ELECTRODOMESTICOS
2nd row: ELECTRODOMESTICOS
3rd row: ELECTRODOMESTICOS
4th row: MOTOS
5th row: ELECTRODOMESTICOS

Common Values

Value               Count     Frequency (%)
ELECTRODOMESTICOS   183951    99.5%
MOTOS               865       0.5%
CONTRATO            19        < 0.1%

Length

Histogram of lengths of the category

Pie chart

Most occurring characters

ValueCountFrequency (%)
O553621
17.7%
E551853
17.6%
T368805
11.8%
S368767
11.8%
C367921
11.7%
M184816
 
5.9%
R183970
 
5.9%
L183951
 
5.9%
D183951
 
5.9%
I183951
 
5.9%
Other values (2)38
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter3131644
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3131644
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3131644
100.0%

periodo_credito
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value           Count
MENSUAL(ES)     172628
DIARIA(S)       9833
SEMANAL(ES)     1758
QUINCENAL(ES)   616

Length

Max length: 13
Median length: 11
Mean length: 10.90026781
Min length: 9

Characters and Unicode

Total characters: 2014751
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: QUINCENAL(ES)
2nd row: MENSUAL(ES)
3rd row: MENSUAL(ES)
4th row: MENSUAL(ES)
5th row: MENSUAL(ES)

Common Values

Value           Count     Frequency (%)
MENSUAL(ES)     172628    93.4%
DIARIA(S)       9833      5.3%
SEMANAL(ES)     1758      1.0%
QUINCENAL(ES)   616       0.3%

Length

Histogram of lengths of the category

Pie chart

Most occurring characters

ValueCountFrequency (%)
S359221
17.8%
E350004
17.4%
A196426
9.7%
(184835
9.2%
)184835
9.2%
N175618
8.7%
L175002
8.7%
M174386
8.7%
U173244
8.6%
I20282
 
1.0%
Other values (4)20898
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1645081
81.7%
Open Punctuation184835
 
9.2%
Close Punctuation184835
 
9.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S359221
21.8%
E350004
21.3%
A196426
11.9%
N175618
10.7%
L175002
10.6%
M174386
10.6%
U173244
10.5%
I20282
 
1.2%
D9833
 
0.6%
R9833
 
0.6%
Other values (2)1232
 
0.1%
Open Punctuation
ValueCountFrequency (%)
(184835
100.0%
Close Punctuation
ValueCountFrequency (%)
)184835
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1645081
81.7%
Common369670
 
18.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
S359221
21.8%
E350004
21.3%
A196426
11.9%
N175618
10.7%
L175002
10.6%
M174386
10.6%
U173244
10.5%
I20282
 
1.2%
D9833
 
0.6%
R9833
 
0.6%
Other values (2)1232
 
0.1%
Common
ValueCountFrequency (%)
(184835
50.0%
)184835
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII2014751
100.0%

cuotas
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.217723916
Minimum: 0
Maximum: 14
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 4
Q3: 10
95-th percentile: 14
Maximum: 14
Range: 14
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 4.460751515
Coefficient of variation (CV): 0.8549228719
Kurtosis: -1.024503967
Mean: 5.217723916
Median Absolute Deviation (MAD): 3
Skewness: 0.6368040254
Sum: 964418
Variance: 19.89830408
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=15)

Most frequent values

Value              Count    Frequency (%)
1                  68639    37.1%
10                 22809    12.3%
5                  14831    8.0%
14                 14147    7.7%
3                  10898    5.9%
6                  10575    5.7%
2                  10538    5.7%
12                 8176     4.4%
4                  8007     4.3%
9                  5635     3.0%
Other values (5)   10580    5.7%

Smallest values

Value              Count    Frequency (%)
0                  2        < 0.1%
1                  68639    37.1%
2                  10538    5.7%
3                  10898    5.9%
4                  8007     4.3%
5                  14831    8.0%
6                  10575    5.7%
7                  2006     1.1%
8                  4059     2.2%
9                  5635     3.0%

Largest values

Value              Count    Frequency (%)
14                 14147    7.7%
13                 622      0.3%
12                 8176     4.4%
11                 3891     2.1%
10                 22809    12.3%
9                  5635     3.0%
8                  4059     2.2%
7                  2006     1.1%
6                  10575    5.7%
5                  14831    8.0%

forma_pago
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value      Count
CRÉDITO    178980
LIBRANZA   5855

Length

Max length: 8
Median length: 7
Mean length: 7.031676901
Min length: 7

Characters and Unicode

Total characters: 1299700
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CRÉDITO
2nd row: CRÉDITO
3rd row: CRÉDITO
4th row: CRÉDITO
5th row: CRÉDITO

Common Values

Value      Count     Frequency (%)
CRÉDITO    178980    96.8%
LIBRANZA   5855      3.2%

Length

Histogram of lengths of the category

Pie chart

Most occurring characters

ValueCountFrequency (%)
R184835
14.2%
I184835
14.2%
C178980
13.8%
É178980
13.8%
D178980
13.8%
T178980
13.8%
O178980
13.8%
A11710
 
0.9%
L5855
 
0.5%
B5855
 
0.5%
Other values (2)11710
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1299700
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1299700
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1120720
86.2%
Latin 1 Sup178980
 
13.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R184835
16.5%
I184835
16.5%
C178980
16.0%
D178980
16.0%
T178980
16.0%
O178980
16.0%
A11710
 
1.0%
L5855
 
0.5%
B5855
 
0.5%
N5855
 
0.5%
Latin 1 Sup
ValueCountFrequency (%)
É178980
100.0%

tipo_cliente
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value   Count
3       94856
1       70297
2       15662
4       4020

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 184835
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 3
3rd row: 1
4th row: 3
5th row: 2

Common Values

Value   Count    Frequency (%)
3       94856    51.3%
1       70297    38.0%
2       15662    8.5%
4       4020     2.2%

Length

Histogram of lengths of the category

Pie chart

Most occurring characters

ValueCountFrequency (%)
394856
51.3%
170297
38.0%
215662
 
8.5%
44020
 
2.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number184835
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common184835
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII184835
100.0%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
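
The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report generator itself runs. The function name `pearson_r` and the sample values are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/(n-1) factors of the covariance and of the two standard
    # deviations cancel, so plain sums of deviations are enough.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]            # hypothetical data
r = pearson_r(x, y)
# Shifting and rescaling x (by a positive factor) leaves r unchanged.
r_rescaled = pearson_r([3 * v + 5 for v in x], y)
print(r, r_rescaled)
```

Constant input would make the denominator zero, so a real implementation would need to handle that case explicitly.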

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
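
As a rough sketch of that recipe, the code below replaces each value by its rank and then applies the Pearson formula to the ranks. Giving tied values the average of their ranks is an assumption (a common convention). The function names and data are illustrative only.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]        # monotonic but not linear
print(spearman_rho(x, y), pearson_r(x, y))
```

On this made-up example the rank correlation is perfect while Pearson's r stays below 1, which is the contrast the paragraph above describes.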

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
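
The pair-counting definition above translates almost literally into code. The sketch below implements exactly that formula (often called tau-a), where tied pairs count as neither concordant nor discordant. Statistical libraries frequently use tie-adjusted variants instead, so results may differ when ties are present. The function name and data are illustrative.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical data: one swapped pair out of six.
print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))
```

This direct version compares every pair and is therefore quadratic in the number of observations, which is fine for illustration but slow for a column of this size.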

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
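
To make the bias correction concrete, the following sketch computes a bias-corrected Cramér's V from a contingency table of counts. The correction terms follow the form usually attributed to Bergsma (2013). The report's own implementation may differ in detail, and the function name and the example tables are hypothetical. The sketch assumes every row and column of the table has a nonzero total.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n_rows, n_cols = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction: subtract the expected bias and shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (n_rows - 1) * (n_cols - 1) / (n - 1))
    rows_corr = n_rows - (n_rows - 1) ** 2 / (n - 1)
    cols_corr = n_cols - (n_cols - 1) ** 2 / (n - 1)
    denominator = min(rows_corr - 1, cols_corr - 1)
    if denominator <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denominator)

independent = [[10, 10], [10, 10]]   # made-up counts, no association
perfect = [[50, 0], [0, 50]]         # made-up counts, perfect association
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```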

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
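
The quantities behind the first and third of these views can be sketched without any plotting library: a per-column count of missing cells, and the correlation between the missing/present indicators of two columns. The rows below are hypothetical, only loosely modelled on the sample rows of this dataset, and the function names are illustrative.

```python
import math

def is_missing(value):
    return value is None or (isinstance(value, float) and math.isnan(value))

def missing_counts(rows, columns):
    """Number of missing cells per column."""
    return {c: sum(is_missing(row.get(c)) for row in rows) for c in columns}

def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation of the 0/1 'is missing' indicators of two columns."""
    a = [1.0 if is_missing(row.get(col_a)) else 0.0 for row in rows]
    b = [1.0 if is_missing(row.get(col_b)) else 0.0 for row in rows]
    n = len(rows)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    if saa == 0 or sbb == 0:
        return float("nan")   # a column that is always or never missing
    return sab / math.sqrt(saa * sbb)

# Hypothetical rows; None marks a missing cell.
rows = [
    {"edad": None, "sueldo_smdlv": None, "municipio_nacimiento": None},
    {"edad": None, "sueldo_smdlv": None, "municipio_nacimiento": "ARAUQUITA"},
    {"edad": 70.0, "sueldo_smdlv": 170.0, "municipio_nacimiento": "Otros"},
    {"edad": None, "sueldo_smdlv": None, "municipio_nacimiento": "ARAUCA"},
]
columns = ["edad", "sueldo_smdlv", "municipio_nacimiento"]
print(missing_counts(rows, columns))
print(nullity_correlation(rows, "edad", "sueldo_smdlv"))
```

A nullity correlation near 1 means two columns tend to be missing on the same rows, while a value near -1 means one tends to be present exactly when the other is missing.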

Sample

First rows

Rowprocedenciageneroestado_civiledadmunicipio_residenciamunicipio_nacimientomunicipio_expediciontiene_casa_propiasueldo_smdlvotros_ingresosotros_ingresos_smdlvmunicipio_creditocodeudorsectoraño_creditomes_creditovalor_credito_smdlvtipo_ventaperiodo_creditocuotasforma_pagotipo_cliente
01NacionalNaNNaNNaNSARAVENANaNNaNNaNNaNSIN OTROS INGRESOSNaNSARAVENASIN CODEUDORPRIVADO202012356ELECTRODOMESTICOSQUINCENAL(ES)1CRÉDITO1
12NacionalNaNNaNNaNARAUQUITAARAUQUITAARAUQUITANaNNaNSIN OTROS INGRESOSNaNARAUQUITACON CODEUDORPRIVADO2018561ELECTRODOMESTICOSMENSUAL(ES)12CRÉDITO3
23NacionalNaNNaNNaNARAUQUITAARAUQUITAARAUQUITANaNNaNSIN OTROS INGRESOSNaNARAUQUITACON CODEUDORPRIVADO20196114ELECTRODOMESTICOSMENSUAL(ES)12CRÉDITO1
34NacionalNaNNaNNaNARAUCANaNNaNNaNNaNSIN OTROS INGRESOSNaNSARAVENACON CODEUDORPRIVADO201911349MOTOSMENSUAL(ES)12CRÉDITO3
45NacionalNaNNaN70.0ARAUCAOtrosOtrosSi170.0SIN OTROS INGRESOSNaNARAUCACON CODEUDORPRIVADO2020749ELECTRODOMESTICOSMENSUAL(ES)5CRÉDITO2
56NacionalNaNNaNNaNARAUCANaNNaNNaNNaNSIN OTROS INGRESOSNaNOtrosCON CODEUDORPRIVADO20191065ELECTRODOMESTICOSMENSUAL(ES)14CRÉDITO1
67NacionalNaNNaNNaNARAUQUITAARAUQUITAARAUCANaNNaNSIN OTROS INGRESOSNaNARAUQUITACON CODEUDORPRIVADO20171118ELECTRODOMESTICOSMENSUAL(ES)5CRÉDITO1
78NacionalNaNNaNNaNSARAVENAARAUCAARAUCANaNNaNSIN OTROS INGRESOSNaNSARAVENACON CODEUDORPRIVADO20071151ELECTRODOMESTICOSMENSUAL(ES)6CRÉDITO1
89NacionalNaNNaNNaNSARAVENAARAUCAARAUCANaNNaNSIN OTROS INGRESOSNaNSARAVENASIN CODEUDORPRIVADO20085118ELECTRODOMESTICOSMENSUAL(ES)10CRÉDITO1
910NacionalNaNNaNNaNSARAVENAARAUCAARAUCANaNNaNSIN OTROS INGRESOSNaNSARAVENASIN CODEUDORPRIVADO2012876ELECTRODOMESTICOSMENSUAL(ES)8CRÉDITO1

Last rows

Rowprocedenciageneroestado_civiledadmunicipio_residenciamunicipio_nacimientomunicipio_expediciontiene_casa_propiasueldo_smdlvotros_ingresosotros_ingresos_smdlvmunicipio_creditocodeudorsectoraño_creditomes_creditovalor_credito_smdlvtipo_ventaperiodo_creditocuotasforma_pagotipo_cliente
184825184826NacionalNaNNaNNaNTAMETAMENaNNaNNaNSIN OTROS INGRESOSNaNTAMESIN CODEUDORPRIVADO20031114ELECTRODOMESTICOSSEMANAL(ES)12CRÉDITO4
184826184827NacionalNaNNaNNaNTAMETAMENaNNaNNaNSIN OTROS INGRESOSNaNTAMESIN CODEUDORPRIVADO20031213ELECTRODOMESTICOSSEMANAL(ES)3CRÉDITO3
184827184828NacionalNaNNaNNaNTAMETAMENaNNaNNaNSIN OTROS INGRESOSNaNTAMESIN CODEUDORPRIVADO20031186ELECTRODOMESTICOSSEMANAL(ES)14CRÉDITO3
184828184829NacionalNaNNaNNaNSARAVENAARAUCANaNNaNNaNSIN OTROS INGRESOSNaNARAUQUITASIN CODEUDORPRIVADO201218ELECTRODOMESTICOSSEMANAL(ES)4CRÉDITO3
184829184830NacionalNaNNaNNaNSARAVENAARAUCANaNNaNNaNSIN OTROS INGRESOSNaNARAUQUITASIN CODEUDORPRIVADO2013417ELECTRODOMESTICOSSEMANAL(ES)14CRÉDITO1
184830184831NacionalNaNNaNNaNSARAVENAARAUCANaNNaNNaNSIN OTROS INGRESOSNaNARAUQUITASIN CODEUDORPRIVADO2012456ELECTRODOMESTICOSSEMANAL(ES)13CRÉDITO3
184831184832NacionalNaNNaNNaNSARAVENAARAUCANaNNaNNaNSIN OTROS INGRESOSNaNARAUQUITASIN CODEUDORPRIVADO2012180ELECTRODOMESTICOSSEMANAL(ES)14CRÉDITO3
184832184833NacionalNaNNaNNaNTAMESARAVENANaNNaNNaNSIN OTROS INGRESOSNaNSARAVENASIN CODEUDORPUBLICO20128132ELECTRODOMESTICOSSEMANAL(ES)14CRÉDITO3
184833184834NacionalNaNNaNNaNSARAVENAARAUQUITANaNNaNNaNSIN OTROS INGRESOSNaNARAUQUITASIN CODEUDORPRIVADO20121214ELECTRODOMESTICOSSEMANAL(ES)8CRÉDITO3
184834184835NacionalNaNNaNNaNARAUCAARAUCANaNNaNNaNSIN OTROS INGRESOSNaNARAUCACON CODEUDORPUBLICO2018689ELECTRODOMESTICOSSEMANAL(ES)14CRÉDITO3